Background / Context: 

As part of the 2010 economic stimulus, a $55 million “Investing in Innovation” (i3) grant from 
the US Department of Education was awarded to scale up Reading Recovery across the nation. 
This paper presents the final round of results from the large-scale, mixed-methods randomized 
evaluation of the implementation and impacts of Reading Recovery under this i3 Scale-Up. The 
first three years of this study found large positive effects on student reading performance and 
also revealed substantial variation in impacts across schools and contexts. In addition, 
results from the mixed-methods implementation evaluation revealed numerous programmatic 
and contextual factors that can support or hinder implementation of Reading Recovery by 
individual schools and may explain variation in impacts across sites. 

Purpose / Objective / Research Question / Focus of Study: 

While prior research on Reading Recovery shows that the program’s impacts on student 
achievement are often large, research also suggests that there is substantial variability in impacts, 
and that much of this variability can be attributed to variation in program implementation 
(Ashdown & Simic, 2000; Center et al., 1995; D’Agostino & Murphy, 2004; Iversen & Tunmer, 1993; 
May, Gray, Gillespie, Sirinides, Sam, Goldsworthy, Armijo, & Tognatta, 2013; Neal & Kelly, 1999; 
Pinnell, 1989; Pinnell et al., 1994; Quay et al., 2001; Rodgers et al., 2004, 2005; Schwartz, 2005). 
The evaluation of the i3-funded scale-up of Reading Recovery uses a rigorous mixed-methods 
research design that supports strong causal inferences about program impacts and 
provides rich descriptions of program implementation, including analysis of individual and 
contextual factors that may explain variation in program impacts when implemented at scale. 

Setting: 

This i3 scale-up evaluation involves several hundred elementary schools across the nation. 

Population / Participants / Subjects: 

Over 10,000 low-performing first-grade students from more than 1,000 schools will have 
participated in this study between 2010 and 2015. 

Intervention / Program / Practice: 

Reading Recovery (RR) involves intensive one-on-one reading instruction provided to the 
lowest-achieving first graders in a school. These students receive 12- to 20-week cycles of 
daily, 30-minute, one-on-one RR sessions. Student progress is monitored daily to ensure that 
instruction is responsive to changes in student achievement and needs. 

Research Design: 

The evaluation design for the i3 scale-up of Reading Recovery includes a rigorous experimental 
research design (i.e., a multi-site randomized controlled trial) that supports strong causal 
inferences about program impacts, coupled with mixed-methods descriptions of program 
implementation and contextual factors. Short-term impacts on students’ reading performance are 
estimated by comparing the mid-year reading achievement of students randomly assigned to 
participate in Reading Recovery at the beginning of first grade with that of students randomly 
assigned to the control condition. 

Data Collection and Analysis: 

The impact of Reading Recovery on mid-year reading achievement of the treatment and control 
students is estimated using a three-level hierarchical linear model (HLM) (Raudenbush & Bryk, 
2002), with students nested within blocks (i.e., matched pairs) and blocks nested within 
participating schools. This HLM includes pretest scores as a covariate, along with random effects 
for blocks, a random effect for overall school performance (i.e., random school intercepts), and a 
random effect for the impact of Reading Recovery (i.e., random treatment effects across 
schools). The primary impact analyses use the Reading Words and Reading Comprehension 
subscales from the Iowa Tests of Basic Skills (ITBS) as the posttest outcome measures and scores 
from the Observation Survey of Early Literacy Achievement (OS) as the pretest covariate. 
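
The model described above can be written out as a sketch. The abstract does not give equations, so the notation here is an assumption: students are indexed by i, blocks by j, and schools by k; T is the treatment indicator and X is the OS pretest score.

```latex
% Level 1 (students within blocks)
Y_{ijk} = \pi_{0jk} + \pi_{1jk} T_{ijk} + \pi_{2} X_{ijk} + e_{ijk},
  \qquad e_{ijk} \sim N(0, \sigma^{2})

% Level 2 (blocks within schools): random block intercepts
\pi_{0jk} = \beta_{00k} + r_{0jk}, \qquad r_{0jk} \sim N(0, \tau_{\pi})
\pi_{1jk} = \beta_{10k}

% Level 3 (schools): random school intercepts and random treatment effects
\beta_{00k} = \gamma_{000} + u_{00k}
\beta_{10k} = \gamma_{100} + u_{10k}

% Combined form
Y_{ijk} = \gamma_{000} + \gamma_{100} T_{ijk} + \pi_{2} X_{ijk}
        + r_{0jk} + u_{00k} + u_{10k} T_{ijk} + e_{ijk}
```

Under this notation, \gamma_{100} is the average impact of Reading Recovery across schools, and the variance of u_{10k} captures how much the impact varies from school to school.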

The enormous sample size for these multi-site studies allows for additional school-level 
contextual analysis of factors associated with variability in program effects. The use of random 
treatment effects for schools in a multilevel modeling framework allows for estimation of cross- 
level interactions to explain variability in treatment effects. School-level data from both 
quantitative and qualitative sources are used as predictors of school-level variability in impact 
estimates, while student demographic variables are used to investigate variability in effects 
across subgroups of students. 
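
A cross-level interaction of this kind can be sketched by adding a school-level predictor to the equation for the school-specific treatment effect. The symbols are assumptions made for illustration: \beta_{10k} is the treatment effect in school k, T_{ijk} is the treatment indicator for student i in block j of school k, and W_k is a hypothetical school-level predictor (for example, an implementation fidelity score).

```latex
% School-specific treatment effect modeled as a function of a school-level predictor
\beta_{10k} = \gamma_{100} + \gamma_{101} W_{k} + u_{10k}

% Substituting into the student-level model yields the cross-level interaction term
\beta_{10k} T_{ijk} = \gamma_{100} T_{ijk} + \gamma_{101} W_{k} T_{ijk} + u_{10k} T_{ijk}
```

Here \gamma_{101} describes how the impact of Reading Recovery changes with W_k, and the variance of u_{10k} is the variation in impacts left unexplained by that predictor.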

Findings / Results: 

Results from the first two years of the randomized experiment suggest large positive effects on 
reading achievement (May, Gray, Gillespie, Sirinides, Sam, Goldsworthy, Armijo, & Tognatta, 
2013; May, Sirinides, Gray, Armijo, Gillespie, Goldsworthy, Sam, & Blalock, 2014). During the 
first two years of the evaluation, a total of 2,296 students from 380 schools were randomly 
assigned to treatment and control conditions. Baseline equivalence was confirmed for gender, 
race, English Language Learner status, and prior text reading level. HLM analyses revealed 
significant overall effects of Reading Recovery between .40 and .60 standard deviations for the 
ITBS Reading Words subscale and between .36 and .61 standard deviations for the ITBS 
Reading Comprehension subscale. The variance of treatment effects across schools was also 
statistically significant, translating to 90% plausible value intervals that extended more than 
one-half of a standard deviation above and below the average effect. 
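
As a minimal sketch of how a plausible value interval is typically computed, the Python below takes an average effect and the standard deviation of school-specific effects and returns the range expected to contain a given share of schools' true effects, assuming those effects are normally distributed. The function name and the input values are hypothetical and chosen only for illustration; they are not the study's estimates.

```python
from statistics import NormalDist


def plausible_value_interval(mean_effect, tau, coverage=0.90):
    """Return (low, high) bounds expected to contain `coverage` of the
    school-specific true effects, assuming they are normally distributed
    with mean `mean_effect` and standard deviation `tau`."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if not 0 < coverage < 1:
        raise ValueError("coverage must be between 0 and 1")
    # Two-sided critical value, e.g. about 1.645 for 90% coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    half_width = z * tau
    return mean_effect - half_width, mean_effect + half_width


# Hypothetical inputs (NOT taken from the study): an average effect of
# 0.50 SD and a between-school SD of treatment effects of 0.35 SD.
low, high = plausible_value_interval(mean_effect=0.50, tau=0.35)
print(f"90% plausible value interval: {low:.2f} to {high:.2f}")
```

With these illustrative inputs the half-width exceeds one-half of a standard deviation, which is the pattern the text describes.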

Data from years 3 and 4 of the RCT will be pooled with data from the first two years in order to 
explore variation in impacts across schools. Although the site-specific impact estimate for an 
individual school is imprecise (i.e., based on only eight students per school), the pooled data will 
include more than 1,000 schools, which will increase power for identifying predictors of impact 
variation despite the unreliability of site-specific impact estimates. Potential predictors fall into 
five groups: implementation fidelity; RR teacher strength (i.e., instructional dexterity, 
growth-mindedness); school-level enactment of Reading Recovery (e.g., level and years of 
implementation); alignment of instructional approach (e.g., use of leveled texts, use of a 
basal-based literacy approach, use of small-group instructional settings); and integration with 
school systems and processes (e.g., extent to which RR data are used in instructional planning, 
frequency and nature of communication between the RR teacher and first-grade teachers, principal 
“ownership” of Reading Recovery, site coordinator engagement). Indicators of implementation 
fidelity are informed by previous papers as well as the Reading Recovery Standards and 
Guidelines, which document requirements and recommendations for successful implementation. 
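
The contrast between imprecise site-specific estimates and a precise pooled analysis can be illustrated with a rough calculation. The Python sketch below uses textbook formulas for a simple difference in means and a balanced multi-site design. It ignores the pretest covariate and the matched-pair blocking used in the actual analysis, and it assumes an even four/four split of the eight students per school and a hypothetical between-school SD of effects of 0.35. All function names are illustrative.

```python
import math


def site_impact_se(n_treat, n_control, outcome_sd=1.0):
    """Standard error of a simple treatment-control difference in means
    within one site (no covariates, no blocking)."""
    return outcome_sd * math.sqrt(1.0 / n_treat + 1.0 / n_control)


def site_reliability(site_se, tau):
    """Share of the variance in estimated site impacts that reflects true
    differences between sites rather than sampling error."""
    return tau ** 2 / (tau ** 2 + site_se ** 2)


def mean_impact_se(n_sites, site_se, tau):
    """Standard error of the average impact across sites in a balanced
    design where every site has the same sampling error."""
    return math.sqrt((tau ** 2 + site_se ** 2) / n_sites)


# Assumed inputs: 4 treated and 4 control students per school, outcomes
# in SD units, tau = 0.35 (hypothetical), and 1,000 schools.
se_site = site_impact_se(n_treat=4, n_control=4)
print(f"site-level SE: {se_site:.3f}")
print(f"site-level reliability: {site_reliability(se_site, tau=0.35):.3f}")
print(f"SE of average impact: {mean_impact_se(1000, se_site, tau=0.35):.4f}")
```

Under these assumptions a single school's estimate is very noisy, while quantities estimated across all schools are much more precise, which is the logic behind pooling the data.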

Conclusions: 

The consistently large positive impacts of Reading Recovery under the i3 scale-up suggest that 
this relatively large investment has led to substantial improvements in the reading performance 
of many thousands of students across the nation. This finding also serves as a point of validation 
for the Investing in Innovation (i3) program model: the size of an investment in an educational 
intervention should be proportionate to its prior evidence of effects. 